Overview

Dataset statistics

Number of variables: 35
Number of observations: 838
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 229.3 KiB
Average record size in memory: 280.2 B

Variable types

Numeric: 9
Categorical: 26

Alerts

STDs_cervical_condylomatosis has constant value "" [Constant]
STDs_AIDS has constant value "" [Constant]
Age is highly overall correlated with No_pregnancies [High correlation]
Biopsy is highly overall correlated with Hinselmann and 1 other fields [High correlation]
Dx is highly overall correlated with Dx_CIN and 2 other fields [High correlation]
Dx_CIN is highly overall correlated with Dx [High correlation]
Dx_Cancer is highly overall correlated with Dx and 1 other fields [High correlation]
Dx_HPV is highly overall correlated with Dx and 1 other fields [High correlation]
Hinselmann is highly overall correlated with Biopsy and 1 other fields [High correlation]
Hormonal_Contraceptives is highly overall correlated with Hormonal_Contraceptives_years [High correlation]
Hormonal_Contraceptives_years is highly overall correlated with Hormonal_Contraceptives [High correlation]
IUD is highly overall correlated with IUD_years [High correlation]
IUD_years is highly overall correlated with IUD [High correlation]
No_pregnancies is highly overall correlated with Age [High correlation]
STDs is highly overall correlated with STDs_No_of_diagnosis and 3 other fields [High correlation]
STDs_HIV is highly overall correlated with STDs_No_of_diagnosis and 1 other fields [High correlation]
STDs_No_of_diagnosis is highly overall correlated with STDs and 4 other fields [High correlation]
STDs_condylomatosis is highly overall correlated with STDs and 3 other fields [High correlation]
STDs_number is highly overall correlated with STDs and 6 other fields [High correlation]
STDs_syphilis is highly overall correlated with STDs_number [High correlation]
STDs_vaginal_condylomatosis is highly overall correlated with STDs_number [High correlation]
STDs_vulvo_perineal_condylomatosis is highly overall correlated with STDs and 3 other fields [High correlation]
Schiller is highly overall correlated with Biopsy and 1 other fields [High correlation]
Smokes is highly overall correlated with Smokes_packs_yr and 1 other fields [High correlation]
Smokes_packs_yr is highly overall correlated with Smokes and 1 other fields [High correlation]
Smokes_yrs is highly overall correlated with Smokes and 1 other fields [High correlation]
IUD is highly imbalanced (54.2%) [Imbalance]
STDs is highly imbalanced (56.1%) [Imbalance]
STDs_number is highly imbalanced (75.6%) [Imbalance]
STDs_condylomatosis is highly imbalanced (71.3%) [Imbalance]
STDs_vaginal_condylomatosis is highly imbalanced (95.6%) [Imbalance]
STDs_vulvo_perineal_condylomatosis is highly imbalanced (71.8%) [Imbalance]
STDs_syphilis is highly imbalanced (85.7%) [Imbalance]
STDs_pelvic_inflammatory_disease is highly imbalanced (98.7%) [Imbalance]
STDs_genital_herpes is highly imbalanced (98.7%) [Imbalance]
STDs_molluscum_contagiosum is highly imbalanced (98.7%) [Imbalance]
STDs_HIV is highly imbalanced (85.7%) [Imbalance]
STDs_Hepatitis_B is highly imbalanced (98.7%) [Imbalance]
STDs_HPV is highly imbalanced (97.6%) [Imbalance]
STDs_No_of_diagnosis is highly imbalanced (78.8%) [Imbalance]
Dx_Cancer is highly imbalanced (85.7%) [Imbalance]
Dx_CIN is highly imbalanced (91.4%) [Imbalance]
Dx_HPV is highly imbalanced (85.7%) [Imbalance]
Dx is highly imbalanced (81.9%) [Imbalance]
Hinselmann is highly imbalanced (75.5%) [Imbalance]
Schiller is highly imbalanced (57.7%) [Imbalance]
Citology is highly imbalanced (70.8%) [Imbalance]
Biopsy is highly imbalanced (65.5%) [Imbalance]
Unnamed: 0 is uniformly distributed [Uniform]
Unnamed: 0 has unique values [Unique]
No_pregnancies has 15 (1.8%) zeros [Zeros]
Smokes_yrs has 716 (85.4%) zeros [Zeros]
Smokes_packs_yr has 716 (85.4%) zeros [Zeros]
Hormonal_Contraceptives_years has 262 (31.3%) zeros [Zeros]
IUD_years has 757 (90.3%) zeros [Zeros]
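
The imbalance percentages in the alerts are consistent with a score of one minus the normalised Shannon entropy of each variable's value counts (0 for perfectly balanced classes, approaching 1 when one class dominates). The sketch below is an assumption inferred from the reported numbers, not a copy of the profiler's code; `imbalance_score` is a hypothetical helper name. With the value counts listed in the variable sections it gives roughly 54.2% for IUD, 56.1% for STDs and 75.6% for STDs_number, matching the alerts.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    Assumption: this mirrors the score shown in the alerts; it is inferred
    from the reported percentages rather than taken from the profiler source.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # A single class carries no entropy; treat it as fully imbalanced.
        return 1.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Value counts copied from the IUD, STDs and STDs_number sections of the report.
iud = imbalance_score([757, 81])
stds = imbalance_score([762, 76])
stds_number = imbalance_score([762, 36, 33, 6, 1])

print(f"IUD: {iud:.1%}, STDs: {stds:.1%}, STDs_number: {stds_number:.1%}")
```

Normalising by the logarithm of the class count keeps the score comparable between the binary flags and the five-level STDs_number.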

Reproduction

Analysis started: 2024-04-07 08:08:54.071750
Analysis finished: 2024-04-07 08:09:12.675965
Duration: 18.6 seconds
Software version: ydata-profiling v4.6.5
Download configuration: config.json

Variables

Unnamed: 0
Real number (ℝ)

UNIFORM  UNIQUE 

Distinct: 838
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 427.84964
Minimum: 0
Maximum: 857
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 42.85
Q1: 215.25
median: 427.5
Q3: 639.75
95-th percentile: 812.15
Maximum: 857
Range: 857
Interquartile range (IQR): 424.5

Descriptive statistics

Standard deviation: 246.56236
Coefficient of variation (CV): 0.57628273
Kurtosis: -1.1905268
Mean: 427.84964
Median Absolute Deviation (MAD): 212.5
Skewness: -0.00022970674
Sum: 358538
Variance: 60792.998
Monotonicity: Strictly increasing

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
0                   1      0.1%
534                 1      0.1%
563                 1      0.1%
565                 1      0.1%
566                 1      0.1%
567                 1      0.1%
568                 1      0.1%
569                 1      0.1%
570                 1      0.1%
571                 1      0.1%
Other values (828)  828    98.8%

Minimum 10 values
Value               Count  Frequency (%)
0                   1      0.1%
1                   1      0.1%
3                   1      0.1%
4                   1      0.1%
5                   1      0.1%
6                   1      0.1%
7                   1      0.1%
8                   1      0.1%
9                   1      0.1%
10                  1      0.1%

Maximum 10 values
Value               Count  Frequency (%)
857                 1      0.1%
856                 1      0.1%
855                 1      0.1%
854                 1      0.1%
853                 1      0.1%
852                 1      0.1%
851                 1      0.1%
850                 1      0.1%
849                 1      0.1%
848                 1      0.1%
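
Several of the figures reported for Unnamed: 0 follow from one another, which gives a quick consistency check on the report. The sketch below recomputes the mean, range, IQR, variance and coefficient of variation from the other reported values. It also counts how many integers between the minimum and maximum are absent, assuming the column holds integer row labels (the listed values are all integers and 2 is skipped in the minimum values, but the report does not state this explicitly). All variable names are illustrative.

```python
# Figures copied from the Unnamed: 0 section of the report.
n = 838                    # number of observations (all distinct)
total = 358538             # Sum
minimum, maximum = 0, 857
q1, q3 = 215.25, 639.75
std = 246.56236            # Standard deviation

mean = total / n                     # reported Mean: 427.84964
value_range = maximum - minimum      # reported Range: 857
iqr = q3 - q1                        # reported IQR: 424.5
variance = std ** 2                  # reported Variance: 60792.998
cv = std / mean                      # reported CV: 0.57628273

# Assumption: values are integer row labels, so a gap-free index from
# minimum to maximum would contain (maximum - minimum + 1) entries.
missing_index_values = (maximum - minimum + 1) - n

print(mean, value_range, iqr, variance, cv, missing_index_values)
```

Under that assumption, 20 labels in the range 0 to 857 do not occur, which would be consistent with rows having been removed before the data was profiled.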

Age
Real number (ℝ)

HIGH CORRELATION 

Distinct: 44
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.812649
Minimum: 13
Maximum: 84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 13
5-th percentile: 16
Q1: 20
median: 25
Q3: 32
95-th percentile: 41
Maximum: 84
Range: 71
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 8.5292095
Coefficient of variation (CV): 0.31810395
Kurtosis: 4.8354724
Mean: 26.812649
Median Absolute Deviation (MAD): 5
Skewness: 1.4139951
Sum: 22469
Variance: 72.747414
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=44)]

Common values
Value               Count  Frequency (%)
23                  54     6.4%
18                  48     5.7%
21                  45     5.4%
20                  45     5.4%
19                  44     5.3%
24                  38     4.5%
26                  37     4.4%
28                  37     4.4%
25                  36     4.3%
17                  35     4.2%
Other values (34)   419    50.0%

Minimum 10 values
Value               Count  Frequency (%)
13                  1      0.1%
14                  5      0.6%
15                  21     2.5%
16                  21     2.5%
17                  35     4.2%
18                  48     5.7%
19                  44     5.3%
20                  45     5.4%
21                  45     5.4%
22                  29     3.5%

Maximum 10 values
Value               Count  Frequency (%)
84                  1      0.1%
79                  1      0.1%
70                  2      0.2%
59                  1      0.1%
52                  2      0.2%
51                  1      0.1%
50                  1      0.1%
49                  2      0.2%
48                  2      0.2%
47                  1      0.1%

No_of_sex_partner
Real number (ℝ)

Distinct: 11
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5077566
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 2
Q3: 3
95-th percentile: 5
Maximum: 28
Range: 27
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.588188
Coefficient of variation (CV): 0.63331028
Kurtosis: 79.540646
Mean: 2.5077566
Median Absolute Deviation (MAD): 1
Skewness: 5.6513044
Sum: 2101.5
Variance: 2.5223412
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=11)]

Common values
Value               Count  Frequency (%)
2                   280    33.4%
3                   209    24.9%
1                   203    24.2%
4                   79     9.4%
5                   44     5.3%
6                   9      1.1%
7                   7      0.8%
8                   4      0.5%
10                  1      0.1%
2.5                 1      0.1%

Minimum 10 values
Value               Count  Frequency (%)
1                   203    24.2%
2                   280    33.4%
2.5                 1      0.1%
3                   209    24.9%
4                   79     9.4%
5                   44     5.3%
6                   9      1.1%
7                   7      0.8%
8                   4      0.5%
10                  1      0.1%

Maximum 10 values
Value               Count  Frequency (%)
28                  1      0.1%
10                  1      0.1%
8                   4      0.5%
7                   7      0.8%
6                   9      1.1%
5                   44     5.3%
4                   79     9.4%
3                   209    24.9%
2.5                 1      0.1%
2                   280    33.4%

First_sexual_intercourse
Real number (ℝ)

Distinct: 21
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.99642
Minimum: 10
Maximum: 32
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 10
5-th percentile: 14
Q1: 15
median: 17
Q3: 18
95-th percentile: 22
Maximum: 32
Range: 22
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.8129648
Coefficient of variation (CV): 0.16550337
Kurtosis: 4.2813604
Mean: 16.99642
Median Absolute Deviation (MAD): 2
Skewness: 1.5680882
Sum: 14243
Variance: 7.9127709
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=21)]

Common values
Value               Count  Frequency (%)
15                  162    19.3%
17                  150    17.9%
18                  134    16.0%
16                  116    13.8%
14                  79     9.4%
19                  59     7.0%
20                  37     4.4%
13                  24     2.9%
21                  20     2.4%
23                  9      1.1%
Other values (11)   48     5.7%

Minimum 10 values
Value               Count  Frequency (%)
10                  2      0.2%
11                  2      0.2%
12                  6      0.7%
13                  24     2.9%
14                  79     9.4%
15                  162    19.3%
16                  116    13.8%
17                  150    17.9%
18                  134    16.0%
19                  59     7.0%

Maximum 10 values
Value               Count  Frequency (%)
32                  1      0.1%
29                  5      0.6%
28                  3      0.4%
27                  6      0.7%
26                  7      0.8%
25                  2      0.2%
24                  6      0.7%
23                  9      1.1%
22                  8      1.0%
21                  20     2.4%

No_pregnancies
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 11
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2708831
Minimum: 0
Maximum: 11
Zeros: 15
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 5
Maximum: 11
Range: 11
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4515305
Coefficient of variation (CV): 0.63919211
Kurtosis: 3.2304941
Mean: 2.2708831
Median Absolute Deviation (MAD): 1
Skewness: 1.4454554
Sum: 1903
Variance: 2.1069409
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=11)]

Common values
Value               Count  Frequency (%)
1                   289    34.5%
2                   247    29.5%
3                   144    17.2%
4                   77     9.2%
5                   37     4.4%
6                   18     2.1%
0                   15     1.8%
7                   6      0.7%
8                   3      0.4%
11                  1      0.1%

Minimum 10 values
Value               Count  Frequency (%)
0                   15     1.8%
1                   289    34.5%
2                   247    29.5%
3                   144    17.2%
4                   77     9.2%
5                   37     4.4%
6                   18     2.1%
7                   6      0.7%
8                   3      0.4%
10                  1      0.1%

Maximum 10 values
Value               Count  Frequency (%)
11                  1      0.1%
10                  1      0.1%
8                   3      0.4%
7                   6      0.7%
6                   18     2.1%
5                   37     4.4%
4                   77     9.2%
3                   144    17.2%
2                   247    29.5%
1                   289    34.5%

Smokes
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 716
1.0                 122

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 1.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 716    85.4%
1.0                 122    14.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 716    85.4%
1.0                 122    14.6%

Most occurring characters

Value               Count  Frequency (%)
0                   1554   61.8%
.                   838    33.3%
1                   122    4.9%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1554   92.7%
1                   122    7.3%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1554   61.8%
.                   838    33.3%
1                   122    4.9%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1554   61.8%
.                   838    33.3%
1                   122    4.9%
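
The character and category tables for Smokes arise because the column is profiled as text: every value is the three-character string "0.0" or "1.0". The sketch below rebuilds those counts from the reported value counts using Python's standard `unicodedata` module. The list `values` is reconstructed from the counts in the report rather than read from the dataset, and the two-letter codes `Nd` and `Po` are the Unicode general categories "Decimal Number" and "Other Punctuation".

```python
import unicodedata
from collections import Counter

# Reconstructed from the reported value counts: 716 x "0.0" and 122 x "1.0".
values = ["0.0"] * 716 + ["1.0"] * 122
text = "".join(values)

total_characters = len(text)                 # reported: 2514
char_counts = Counter(text)                  # per-character frequencies
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(total_characters)
print(char_counts.most_common())
print(category_counts.most_common())
```

The same reasoning explains why the other 0.0/1.0 columns all report 2514 total characters and an identical category split.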

Smokes_yrs
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 30
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2167835
Minimum: 0
Maximum: 37
Zeros: 716
Zeros (%): 85.4%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 9.15
Maximum: 37
Range: 37
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 4.0908358
Coefficient of variation (CV): 3.3620079
Kurtosis: 23.919463
Mean: 1.2167835
Median Absolute Deviation (MAD): 0
Skewness: 4.4828998
Sum: 1019.6646
Variance: 16.734938
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=30)]

Common values
Value               Count  Frequency (%)
0                   716    85.4%
1.266972909         15     1.8%
5                   9      1.1%
9                   9      1.1%
1                   8      1.0%
3                   7      0.8%
2                   7      0.8%
16                  6      0.7%
7                   6      0.7%
8                   6      0.7%
Other values (20)   49     5.8%

Minimum 10 values
Value               Count  Frequency (%)
0                   716    85.4%
0.16                1      0.1%
0.5                 3      0.4%
1                   8      1.0%
1.266972909         15     1.8%
2                   7      0.8%
3                   7      0.8%
4                   5      0.6%
5                   9      1.1%
6                   4      0.5%

Maximum 10 values
Value               Count  Frequency (%)
37                  1      0.1%
34                  1      0.1%
32                  1      0.1%
28                  1      0.1%
24                  1      0.1%
22                  2      0.2%
21                  1      0.1%
20                  1      0.1%
19                  3      0.4%
18                  1      0.1%

Smokes_packs_yr
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 61
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.45036592
Minimum: 0
Maximum: 37
Zeros: 716
Zeros (%): 85.4%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2.415
Maximum: 37
Range: 37
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.2287538
Coefficient of variation (CV): 4.9487622
Kurtosis: 115.39083
Mean: 0.45036592
Median Absolute Deviation (MAD): 0
Skewness: 9.3496831
Sum: 377.40664
Variance: 4.9673435
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value               Count  Frequency (%)
0                   716    85.4%
0.5132021277        18     2.1%
1                   6      0.7%
3                   5      0.6%
0.75                4      0.5%
0.05                4      0.5%
1.2                 4      0.5%
0.2                 4      0.5%
2                   4      0.5%
0.1                 3      0.4%
Other values (51)   70     8.4%

Minimum 10 values
Value               Count  Frequency (%)
0                   716    85.4%
0.001               1      0.1%
0.003               1      0.1%
0.025               1      0.1%
0.04                2      0.2%
0.05                4      0.5%
0.1                 3      0.4%
0.15                1      0.1%
0.16                2      0.2%
0.2                 4      0.5%

Maximum 10 values
Value               Count  Frequency (%)
37                  1      0.1%
22                  1      0.1%
21                  1      0.1%
19                  1      0.1%
15                  1      0.1%
12                  3      0.4%
9                   2      0.2%
8                   2      0.2%
7.6                 1      0.1%
7.5                 1      0.1%

Hormonal_Contraceptives
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
1.0                 576
0.0                 262

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 1.0
4th row: 1.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
1.0                 576    68.7%
0.0                 262    31.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
1.0                 576    68.7%
0.0                 262    31.3%

Most occurring characters

Value               Count  Frequency (%)
0                   1100   43.8%
.                   838    33.3%
1                   576    22.9%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1100   65.6%
1                   576    34.4%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1100   43.8%
.                   838    33.3%
1                   576    22.9%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1100   43.8%
.                   838    33.3%
1                   576    22.9%

Hormonal_Contraceptives_years
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 45
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3585434
Minimum: 0
Maximum: 30
Zeros: 262
Zeros (%): 31.3%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0.58
Q3: 3
95-th percentile: 10
Maximum: 30
Range: 30
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.809446
Coefficient of variation (CV): 1.6151689
Kurtosis: 8.4338841
Mean: 2.3585434
Median Absolute Deviation (MAD): 0.58
Skewness: 2.5547813
Sum: 1976.4594
Variance: 14.511879
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=45)]

Common values
Value               Count  Frequency (%)
0                   262    31.3%
1                   98     11.7%
0.25                45     5.4%
2                   44     5.3%
3                   42     5.0%
5                   39     4.7%
0.5                 35     4.2%
0.08                33     3.9%
6                   28     3.3%
7                   25     3.0%
Other values (35)   187    22.3%

Minimum 10 values
Value               Count  Frequency (%)
0                   262    31.3%
0.08                33     3.9%
0.16                20     2.4%
0.17                1      0.1%
0.25                45     5.4%
0.33                10     1.2%
0.41                1      0.1%
0.42                10     1.2%
0.5                 35     4.2%
0.58                7      0.8%

Maximum 10 values
Value               Count  Frequency (%)
30                  1      0.1%
22                  1      0.1%
20                  6      0.7%
19                  2      0.2%
17                  1      0.1%
16                  2      0.2%
15                  6      0.7%
14                  4      0.5%
13                  2      0.2%
12                  4      0.5%

IUD
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 757
1.0                 81

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 757    90.3%
1.0                 81     9.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 757    90.3%
1.0                 81     9.7%

Most occurring characters

Value               Count  Frequency (%)
0                   1595   63.4%
.                   838    33.3%
1                   81     3.2%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1595   95.2%
1                   81     4.8%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1595   63.4%
.                   838    33.3%
1                   81     3.2%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1595   63.4%
.                   838    33.3%
1                   81     3.2%

IUD_years
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 26
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.44566826
Minimum: 0
Maximum: 19
Zeros: 757
Zeros (%): 90.3%
Negative: 0
Negative (%): 0.0%
Memory size: 6.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3
Maximum: 19
Range: 19
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.8237375
Coefficient of variation (CV): 4.0921414
Kurtosis: 35.275773
Mean: 0.44566826
Median Absolute Deviation (MAD): 0
Skewness: 5.4248305
Sum: 373.47
Variance: 3.3260186
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=26)]

Common values
Value               Count  Frequency (%)
0                   757    90.3%
3                   11     1.3%
5                   9      1.1%
2                   9      1.1%
1                   8      1.0%
8                   7      0.8%
7                   7      0.8%
4                   5      0.6%
6                   4      0.5%
11                  3      0.4%
Other values (16)   18     2.1%

Minimum 10 values
Value               Count  Frequency (%)
0                   757    90.3%
0.08                2      0.2%
0.16                1      0.1%
0.17                1      0.1%
0.25                1      0.1%
0.33                1      0.1%
0.41                1      0.1%
0.5                 2      0.2%
0.58                1      0.1%
0.91                1      0.1%

Maximum 10 values
Value               Count  Frequency (%)
19                  1      0.1%
17                  1      0.1%
15                  1      0.1%
12                  1      0.1%
11                  3      0.4%
10                  1      0.1%
9                   1      0.1%
8                   7      0.8%
7                   7      0.8%
6                   4      0.5%
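
The zero counts of the duration columns equal the 0.0 counts of their indicator columns (716 for Smokes, 262 for Hormonal_Contraceptives, 757 for IUD), which is one plausible reason each pair is flagged as highly correlated. The sketch below only re-checks that arithmetic from figures printed in this report; `PAIRS` and `zero_share` are illustrative names, and the link to the correlation alerts is an interpretation rather than something the report states.

```python
N_ROWS = 838  # number of observations reported in the overview

# (zeros in the duration column, 0.0 count in the indicator column),
# both copied from the corresponding variable sections.
PAIRS = {
    ("Smokes", "Smokes_yrs"): (716, 716),
    ("Smokes", "Smokes_packs_yr"): (716, 716),
    ("Hormonal_Contraceptives", "Hormonal_Contraceptives_years"): (262, 262),
    ("IUD", "IUD_years"): (757, 757),
}


def zero_share(zero_count, n_rows=N_ROWS):
    """Percentage of rows equal to zero, rounded to one decimal like the report."""
    return round(100 * zero_count / n_rows, 1)


for (flag, duration), (zeros, flag_zeros) in PAIRS.items():
    matches = zeros == flag_zeros
    print(f"{duration}: {zero_share(zeros)}% zeros, matches {flag} == 0.0: {matches}")
```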

STDs
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 762
1.0                 76

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 762    90.9%
1.0                 76     9.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 762    90.9%
1.0                 76     9.1%

Most occurring characters

Value               Count  Frequency (%)
0                   1600   63.6%
.                   838    33.3%
1                   76     3.0%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1600   95.5%
1                   76     4.5%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1600   63.6%
.                   838    33.3%
1                   76     3.0%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1600   63.6%
.                   838    33.3%
1                   76     3.0%

STDs_number
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 762
2.0                 36
1.0                 33
3.0                 6
4.0                 1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 762    90.9%
2.0                 36     4.3%
1.0                 33     3.9%
3.0                 6      0.7%
4.0                 1      0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 762    90.9%
2.0                 36     4.3%
1.0                 33     3.9%
3.0                 6      0.7%
4.0                 1      0.1%

Most occurring characters

Value               Count  Frequency (%)
0                   1600   63.6%
.                   838    33.3%
2                   36     1.4%
1                   33     1.3%
3                   6      0.2%
4                   1      < 0.1%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1600   95.5%
2                   36     2.1%
1                   33     2.0%
3                   6      0.4%
4                   1      0.1%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1600   63.6%
.                   838    33.3%
2                   36     1.4%
1                   33     1.3%
3                   6      0.2%
4                   1      < 0.1%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1600   63.6%
.                   838    33.3%
2                   36     1.4%
1                   33     1.3%
3                   6      0.2%
4                   1      < 0.1%

STDs_condylomatosis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 796
1.0                 42

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 796    95.0%
1.0                 42     5.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 796    95.0%
1.0                 42     5.0%

Most occurring characters

Value               Count  Frequency (%)
0                   1634   65.0%
.                   838    33.3%
1                   42     1.7%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1634   97.5%
1                   42     2.5%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1634   65.0%
.                   838    33.3%
1                   42     1.7%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1634   65.0%
.                   838    33.3%
1                   42     1.7%

STDs_cervical_condylomatosis
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 838

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 2
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 838    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 838    100.0%

Most occurring characters

Value               Count  Frequency (%)
0                   1676   66.7%
.                   838    33.3%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1676   100.0%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1676   66.7%
.                   838    33.3%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1676   66.7%
.                   838    33.3%

STDs_vaginal_condylomatosis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 834
1.0                 4

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 834    99.5%
1.0                 4      0.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 834    99.5%
1.0                 4      0.5%

Most occurring characters

Value               Count  Frequency (%)
0                   1672   66.5%
.                   838    33.3%
1                   4      0.2%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1672   99.8%
1                   4      0.2%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1672   66.5%
.                   838    33.3%
1                   4      0.2%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1672   66.5%
.                   838    33.3%
1                   4      0.2%

STDs_vulvo_perineal_condylomatosis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 797
1.0                 41

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 797    95.1%
1.0                 41     4.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 797    95.1%
1.0                 41     4.9%

Most occurring characters

Value               Count  Frequency (%)
0                   1635   65.0%
.                   838    33.3%
1                   41     1.6%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1635   97.6%
1                   41     2.4%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1635   65.0%
.                   838    33.3%
1                   41     1.6%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1635   65.0%
.                   838    33.3%
1                   41     1.6%

STDs_syphilis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB

Value               Count
0.0                 821
1.0                 17

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value               Count  Frequency (%)
0.0                 821    98.0%
1.0                 17     2.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value               Count  Frequency (%)
0.0                 821    98.0%
1.0                 17     2.0%

Most occurring characters

Value               Count  Frequency (%)
0                   1659   66.0%
.                   838    33.3%
1                   17     0.7%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      1676   66.7%
Other Punctuation   838    33.3%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
0                   1659   99.0%
1                   17     1.0%
Other Punctuation
Value               Count  Frequency (%)
.                   838    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              2514   100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
0                   1659   66.0%
.                   838    33.3%
1                   17     0.7%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               2514   100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
0                   1659   66.0%
.                   838    33.3%
1                   17     0.7%

STDs_pelvic_inflammatory_disease
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0.0 (837), 1.0 (1)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 837 | 99.9%
1.0 | 1 | 0.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1675 | 99.9%
1 | 1 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

STDs_genital_herpes
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0.0 (837), 1.0 (1)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 837 | 99.9%
1.0 | 1 | 0.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1675 | 99.9%
1 | 1 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

STDs_molluscum_contagiosum
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0.0 (837), 1.0 (1)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 837 | 99.9%
1.0 | 1 | 0.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1675 | 99.9%
1 | 1 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

STDs_AIDS
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0.0 (838)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 2
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 838 | 100.0%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 1676 | 66.7%
. | 838 | 33.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1676 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1676 | 66.7%
. | 838 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1676 | 66.7%
. | 838 | 33.3%

STDs_HIV
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0.0 (821), 1.0 (17)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 821 | 98.0%
1.0 | 17 | 2.0%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 1659 | 66.0%
. | 838 | 33.3%
1 | 17 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1659 | 99.0%
1 | 17 | 1.0%
Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1659 | 66.0%
. | 838 | 33.3%
1 | 17 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1659 | 66.0%
. | 838 | 33.3%
1 | 17 | 0.7%

STDs_Hepatitis_B
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0.0 (837), 1.0 (1)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 837 | 99.9%
1.0 | 1 | 0.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1675 | 99.9%
1 | 1 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1675 | 66.6%
. | 838 | 33.3%
1 | 1 | < 0.1%

STDs_HPV
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0.0 (836), 1.0 (2)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2514
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 836 | 99.8%
1.0 | 2 | 0.2%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 1674 | 66.6%
. | 838 | 33.3%
1 | 2 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1676 | 66.7%
Other Punctuation | 838 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1674 | 99.9%
1 | 2 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
. | 838 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2514 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1674 | 66.6%
. | 838 | 33.3%
1 | 2 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2514 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1674 | 66.6%
. | 838 | 33.3%
1 | 2 | 0.1%

STDs_No_of_diagnosis
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (770), 1 (66), 3 (1), 2 (1)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 0.2%
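
The figures for this variable are consistent with "Distinct" counting the different values present and "Unique" counting the values that occur exactly once. A minimal sketch that recomputes both from the value counts listed for this variable (0: 770, 1: 66, 3: 1, 2: 1); the variable names are illustrative:

```python
from collections import Counter

# Value counts as listed in the report for this variable.
counts = Counter({"0": 770, "1": 66, "3": 1, "2": 1})

n = sum(counts.values())                             # number of observations
distinct = len(counts)                               # different values present
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

distinct_pct = round(100 * distinct / n, 1)
unique_pct = round(100 * unique / n, 1)

print(n, distinct, distinct_pct, unique, unique_pct)
```

The same reading explains the other variables in this section: a 837/1 split has one unique value, while a 821/17 split has none.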

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 770 | 91.9%
1 | 66 | 7.9%
3 | 1 | 0.1%
2 | 1 | 0.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 770 | 91.9%
1 | 66 | 7.9%
3 | 1 | 0.1%
2 | 1 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 770 | 91.9%
1 | 66 | 7.9%
3 | 1 | 0.1%
2 | 1 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 770 | 91.9%
1 | 66 | 7.9%
3 | 1 | 0.1%
2 | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 770 | 91.9%
1 | 66 | 7.9%
3 | 1 | 0.1%
2 | 1 | 0.1%

Dx_Cancer
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (821), 1 (17)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Dx_CIN
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (829), 1 (9)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 829 | 98.9%
1 | 9 | 1.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 829 | 98.9%
1 | 9 | 1.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 829 | 98.9%
1 | 9 | 1.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 829 | 98.9%
1 | 9 | 1.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 829 | 98.9%
1 | 9 | 1.1%

Dx_HPV
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (821), 1 (17)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 821 | 98.0%
1 | 17 | 2.0%

Dx
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (815), 1 (23)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 815 | 97.3%
1 | 23 | 2.7%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 815 | 97.3%
1 | 23 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 815 | 97.3%
1 | 23 | 2.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 815 | 97.3%
1 | 23 | 2.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 815 | 97.3%
1 | 23 | 2.7%

Hinselmann
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (804), 1 (34)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 804 | 95.9%
1 | 34 | 4.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 804 | 95.9%
1 | 34 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 804 | 95.9%
1 | 34 | 4.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 804 | 95.9%
1 | 34 | 4.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 804 | 95.9%
1 | 34 | 4.1%

Schiller
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (766), 1 (72)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 766 | 91.4%
1 | 72 | 8.6%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 766 | 91.4%
1 | 72 | 8.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 766 | 91.4%
1 | 72 | 8.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 766 | 91.4%
1 | 72 | 8.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 766 | 91.4%
1 | 72 | 8.6%

Citology
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (795), 1 (43)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 795 | 94.9%
1 | 43 | 5.1%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 795 | 94.9%
1 | 43 | 5.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 795 | 94.9%
1 | 43 | 5.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 795 | 94.9%
1 | 43 | 5.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 795 | 94.9%
1 | 43 | 5.1%

Biopsy
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.7 KiB
Value counts: 0 (784), 1 (54)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 838
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 784 | 93.6%
1 | 54 | 6.4%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 784 | 93.6%
1 | 54 | 6.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 838 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 784 | 93.6%
1 | 54 | 6.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 838 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 784 | 93.6%
1 | 54 | 6.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 838 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 784 | 93.6%
1 | 54 | 6.4%

Interactions

Correlations
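
The report does not say which association measure fills this matrix. For pairs of categorical variables, one widely used choice is Cramér's V, derived from the chi-square statistic of the contingency table. A minimal sketch of the plain (not bias-corrected) form, assuming that measure; `cramers_v` is a hypothetical helper and the example counts are illustrative, not taken from the dataset:

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0

# Hypothetical 2x2 cross-tabulation of two binary flags (illustrative counts).
example = [[815, 6], [6, 11]]
v = cramers_v(example)
print(round(v, 3))
```

The value ranges from 0 (no association) to 1 (one variable fully determines the other), which matches the 0-to-1 range of the categorical entries in the matrix; negative entries can only arise for numeric pairs under a signed coefficient.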

2024-04-07T13:09:23.333493image/svg+xmlMatplotlib v3.8.3, https://matplotlib.org/
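Variable | Age | Biopsy | Citology | Dx | Dx_CIN | Dx_Cancer | Dx_HPV | First_sexual_intercourse | Hinselmann | Hormonal_Contraceptives | Hormonal_Contraceptives_years | IUD | IUD_years | No_of_sex_partner | No_pregnancies | STDs | STDs_HIV | STDs_HPV | STDs_Hepatitis_B | STDs_No_of_diagnosis | STDs_condylomatosis | STDs_genital_herpes | STDs_molluscum_contagiosum | STDs_number | STDs_pelvic_inflammatory_disease | STDs_syphilis | STDs_vaginal_condylomatosis | STDs_vulvo_perineal_condylomatosis | Schiller | Smokes | Smokes_packs_yr | Smokes_yrs | Unnamed: 0
Age | 1.000 | 0.054 | 0.000 | 0.204 | 0.334 | 0.073 | 0.063 | 0.436 | 0.000 | 0.162 | 0.262 | 0.281 | 0.282 | 0.212 | 0.541 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.047 | 0.000 | 0.000 | 0.102 | 0.025 | 0.057 | 0.064 | -0.224
Biopsy | 0.054 | 1.000 | 0.301 | 0.145 | 0.084 | 0.148 | 0.148 | 0.033 | 0.524 | 0.000 | 0.006 | 0.041 | 0.062 | 0.015 | 0.047 | 0.106 | 0.112 | 0.000 | 0.000 | 0.107 | 0.077 | 0.051 | 0.000 | 0.101 | 0.000 | 0.000 | 0.000 | 0.080 | 0.726 | 0.000 | 0.034 | 0.034 | -0.011
Citology | 0.000 | 0.301 | 1.000 | 0.069 | 0.000 | 0.095 | 0.095 | -0.011 | 0.154 | 0.000 | -0.013 | 0.000 | 0.016 | 0.036 | -0.042 | 0.035 | 0.052 | 0.000 | 0.000 | 0.042 | 0.047 | 0.000 | 0.000 | 0.028 | 0.000 | 0.000 | 0.000 | 0.049 | 0.342 | 0.000 | -0.004 | -0.004 | 0.024
Dx | 0.204 | 0.145 | 0.069 | 1.000 | 0.584 | 0.623 | 0.571 | 0.062 | 0.047 | 0.000 | -0.014 | 0.100 | 0.119 | 0.034 | 0.062 | 0.000 | 0.000 | 0.057 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.056 | 0.048 | -0.069 | -0.069 | 0.074
Dx_CIN | 0.334 | 0.084 | 0.000 | 0.584 | 1.000 | 0.000 | 0.000 | -0.027 | 0.000 | 0.000 | 0.004 | 0.000 | 0.042 | 0.023 | 0.065 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | -0.043 | -0.043 | 0.036
Dx_Cancer | 0.073 | 0.148 | 0.095 | 0.623 | 0.000 | 1.000 | 0.850 | 0.098 | 0.116 | 0.000 | 0.025 | 0.074 | 0.099 | 0.030 | 0.047 | 0.000 | 0.000 | 0.251 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.117 | 0.000 | -0.011 | -0.006 | 0.044
Dx_HPV | 0.063 | 0.148 | 0.095 | 0.571 | 0.000 | 0.850 | 1.000 | 0.065 | 0.116 | 0.000 | 0.043 | 0.000 | 0.038 | 0.029 | 0.064 | 0.000 | 0.000 | 0.251 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.117 | 0.000 | 0.012 | 0.015 | 0.017
First_sexual_intercourse | 0.436 | 0.033 | -0.011 | 0.062 | -0.027 | 0.098 | 0.065 | 1.000 | 0.000 | 0.068 | 0.061 | 0.000 | -0.009 | -0.128 | -0.011 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.045 | 0.000 | 0.000 | 0.029 | 0.000 | 0.018 | 0.185 | 0.049 | 0.000 | 0.104 | -0.137 | -0.133 | -0.146
Hinselmann | 0.000 | 0.524 | 0.154 | 0.047 | 0.000 | 0.116 | 0.116 | 0.000 | 1.000 | 0.000 | 0.017 | 0.029 | 0.054 | -0.028 | 0.033 | 0.037 | 0.070 | 0.000 | 0.000 | 0.160 | 0.036 | 0.000 | 0.000 | 0.163 | 0.000 | 0.000 | 0.000 | 0.038 | 0.638 | 0.000 | 0.036 | 0.039 | -0.070
Hormonal_Contraceptives | 0.162 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.068 | 0.000 | 1.000 | 0.816 | 0.000 | -0.011 | 0.028 | 0.146 | 0.000 | 0.047 | 0.000 | 0.000 | 0.037 | 0.000 | 0.000 | 0.000 | 0.019 | 0.000 | 0.000 | 0.031 | 0.000 | 0.000 | 0.000 | -0.001 | -0.000 | -0.067
Hormonal_Contraceptives_years | 0.262 | 0.006 | -0.013 | -0.014 | 0.004 | 0.025 | 0.043 | 0.061 | 0.017 | 0.816 | 1.000 | 0.153 | 0.021 | 0.049 | 0.259 | 0.013 | 0.000 | 0.093 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.076 | 0.000 | 0.000 | 0.151 | 0.107 | 0.063 | 0.064 | -0.119
IUD | 0.281 | 0.041 | 0.000 | 0.100 | 0.000 | 0.074 | 0.000 | 0.000 | 0.029 | 0.000 | 0.153 | 1.000 | 0.998 | 0.069 | 0.237 | 0.028 | 0.000 | 0.000 | 0.000 | 0.000 | 0.075 | 0.000 | 0.000 | 0.092 | 0.000 | 0.000 | 0.000 | 0.057 | 0.072 | 0.035 | -0.050 | -0.048 | -0.145
IUD_years | 0.282 | 0.062 | 0.016 | 0.119 | 0.042 | 0.099 | 0.038 | -0.009 | 0.054 | -0.011 | 0.021 | 0.998 | 1.000 | 0.068 | 0.236 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.173 | 0.000 | -0.049 | -0.047 | -0.148
No_of_sex_partner | 0.212 | 0.015 | 0.036 | 0.034 | 0.023 | 0.030 | 0.029 | -0.128 | -0.028 | 0.028 | 0.049 | 0.069 | 0.068 | 1.000 | 0.172 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.025 | 0.000 | 0.040 | 0.000 | 0.040 | 0.000 | 0.000 | 0.027 | 0.000 | 0.177 | 0.247 | 0.243 | -0.051
No_pregnancies | 0.541 | 0.047 | -0.042 | 0.062 | 0.065 | 0.047 | 0.064 | -0.011 | 0.033 | 0.146 | 0.259 | 0.237 | 0.236 | 0.172 | 1.000 | 0.066 | 0.000 | 0.000 | 0.000 | 0.000 | 0.078 | 0.000 | 0.047 | 0.049 | 0.000 | 0.196 | 0.000 | 0.070 | 0.111 | 0.104 | 0.068 | 0.070 | -0.197
STDs | 0.000 | 0.106 | 0.035 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.037 | 0.000 | 0.013 | 0.028 | 0.000 | 0.000 | 0.066 | 1.000 | 0.440 | 0.107 | 0.035 | 0.940 | 0.717 | 0.035 | 0.035 | 0.998 | 0.035 | 0.440 | 0.186 | 0.708 | 0.113 | 0.106 | 0.120 | 0.121 | -0.044
STDs_HIV | 0.000 | 0.112 | 0.052 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.070 | 0.047 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.440 | 1.000 | 0.000 | 0.112 | 0.548 | 0.054 | 0.000 | 0.000 | 0.620 | 0.000 | 0.000 | 0.000 | 0.056 | 0.117 | 0.034 | 0.070 | 0.068 | -0.047
STDs_HPV | 0.000 | 0.000 | 0.000 | 0.057 | 0.000 | 0.251 | 0.251 | 0.000 | 0.000 | 0.000 | 0.093 | 0.000 | 0.000 | 0.000 | 0.000 | 0.107 | 0.000 | 1.000 | 0.000 | 0.048 | 0.000 | 0.000 | 0.000 | 0.232 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.041 | 0.053 | 0.081
STDs_Hepatitis_B | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.035 | 0.112 | 0.000 | 1.000 | 0.102 | 0.000 | 0.000 | 0.000 | 0.148 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.094 | 0.091 | -0.005
STDs_No_of_diagnosis | 0.000 | 0.107 | 0.042 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.160 | 0.037 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.940 | 0.548 | 0.048 | 0.102 | 1.000 | 0.698 | 0.102 | 0.102 | 0.823 | 0.102 | 0.436 | 0.229 | 0.688 | 0.151 | 0.103 | 0.118 | 0.116 | -0.033
STDs_condylomatosis | 0.000 | 0.077 | 0.047 | 0.000 | 0.000 | 0.000 | 0.000 | 0.045 | 0.036 | 0.000 | 0.000 | 0.075 | 0.000 | 0.025 | 0.078 | 0.717 | 0.054 | 0.000 | 0.000 | 0.698 | 1.000 | 0.000 | 0.000 | 0.986 | 0.000 | 0.000 | 0.260 | 0.975 | 0.110 | 0.040 | 0.062 | 0.063 | -0.009
STDs_genital_herpes | 0.000 | 0.051 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.035 | 0.000 | 0.000 | 0.000 | 0.102 | 0.000 | 1.000 | 0.000 | 0.156 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | -0.014 | -0.014 | -0.015
STDs_molluscum_contagiosum | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.040 | 0.047 | 0.035 | 0.000 | 0.000 | 0.000 | 0.102 | 0.000 | 0.000 | 1.000 | 0.156 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | -0.014 | -0.014 | 0.031
STDs_number | 0.000 | 0.101 | 0.028 | 0.000 | 0.000 | 0.000 | 0.000 | 0.029 | 0.163 | 0.019 | 0.000 | 0.092 | 0.000 | 0.000 | 0.049 | 0.998 | 0.620 | 0.232 | 0.148 | 0.823 | 0.986 | 0.156 | 0.156 | 1.000 | 0.156 | 0.674 | 0.612 | 0.974 | 0.149 | 0.117 | 0.120 | 0.120 | -0.041
STDs_pelvic_inflammatory_disease | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.040 | 0.000 | 0.035 | 0.000 | 0.000 | 0.000 | 0.102 | 0.000 | 0.000 | 0.000 | 0.156 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | -0.014 | -0.014 | 0.056
STDs_syphilis | 0.047 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.018 | 0.000 | 0.000 | 0.076 | 0.000 | 0.000 | 0.000 | 0.196 | 0.440 | 0.000 | 0.000 | 0.000 | 0.436 | 0.000 | 0.000 | 0.000 | 0.674 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.064 | 0.082 | 0.079 | -0.064
STDs_vaginal_condylomatosis | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.185 | 0.000 | 0.031 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.186 | 0.000 | 0.000 | 0.000 | 0.229 | 0.260 | 0.000 | 0.000 | 0.612 | 0.000 | 0.000 | 1.000 | 0.182 | 0.000 | 0.029 | 0.076 | 0.080 | 0.040
STDs_vulvo_perineal_condylomatosis | 0.000 | 0.080 | 0.049 | 0.000 | 0.000 | 0.000 | 0.000 | 0.049 | 0.038 | 0.000 | 0.000 | 0.057 | 0.000 | 0.027 | 0.070 | 0.708 | 0.056 | 0.000 | 0.000 | 0.688 | 0.975 | 0.000 | 0.000 | 0.974 | 0.000 | 0.000 | 0.182 | 1.000 | 0.113 | 0.043 | 0.065 | 0.066 | -0.014
Schiller | 0.102 | 0.726 | 0.342 | 0.056 | 0.000 | 0.117 | 0.117 | 0.000 | 0.638 | 0.000 | 0.151 | 0.072 | 0.173 | 0.000 | 0.111 | 0.113 | 0.117 | 0.000 | 0.000 | 0.151 | 0.110 | 0.000 | 0.000 | 0.149 | 0.000 | 0.000 | 0.000 | 0.113 | 1.000 | 0.034 | 0.058 | 0.062 | -0.022
Smokes | 0.025 | 0.000 | 0.000 | 0.048 | 0.000 | 0.000 | 0.000 | 0.104 | 0.000 | 0.000 | 0.107 | 0.035 | 0.000 | 0.177 | 0.104 | 0.106 | 0.034 | 0.000 | 0.003 | 0.103 | 0.040 | 0.000 | 0.000 | 0.117 | 0.000 | 0.064 | 0.029 | 0.043 | 0.034 | 1.000 | 0.996 | 0.996 | 0.015
Smokes_packs_yr | 0.057 | 0.034 | -0.004 | -0.069 | -0.043 | -0.011 | 0.012 | -0.137 | 0.036 | -0.001 | 0.063 | -0.050 | -0.049 | 0.247 | 0.068 | 0.120 | 0.070 | 0.041 | 0.094 | 0.118 | 0.062 | -0.014 | -0.014 | 0.120 | -0.014 | 0.082 | 0.076 | 0.065 | 0.058 | 0.996 | 1.000 | 0.997 | 0.008
Smokes_yrs | 0.064 | 0.034 | -0.004 | -0.069 | -0.043 | -0.006 | 0.015 | -0.133 | 0.039 | -0.000 | 0.064 | -0.048 | -0.047 | 0.243 | 0.070 | 0.121 | 0.068 | 0.053 | 0.091 | 0.116 | 0.063 | -0.014 | -0.014 | 0.120 | -0.014 | 0.079 | 0.080 | 0.066 | 0.062 | 0.996 | 0.997 | 1.000 | 0.010
Unnamed: 0 | -0.224 | -0.011 | 0.024 | 0.074 | 0.036 | 0.044 | 0.017 | -0.146 | -0.070 | -0.067 | -0.119 | -0.145 | -0.148 | -0.051 | -0.197 | -0.044 | -0.047 | 0.081 | -0.005 | -0.033 | -0.009 | -0.015 | 0.031 | -0.041 | 0.056 | -0.064 | 0.040 | -0.014 | -0.022 | 0.015 | 0.008 | 0.010 | 1.000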
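
As a rough illustration of how a pairwise correlation matrix like the one above is built, the sketch below computes one for three columns using only the Python standard library. It assumes Pearson's coefficient purely for illustration: this section does not say which measure the report used, so the numbers it prints are not expected to match the table. The helper names `pearson` and `correlation_matrix` are hypothetical, and the ten values per column are as read from the first ten rows of the Sample table.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is constant
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Assumption: values read from the first ten rows of the Sample table.
columns = {
    "Age": [18, 15, 52, 46, 42, 51, 26, 45, 44, 44],
    "No_pregnancies": [1, 1, 4, 4, 2, 6, 3, 5, 8, 4],
    "First_sexual_intercourse": [15, 14, 16, 21, 23, 17, 26, 20, 15, 26],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, [round(value, 3) for value in row.values()])
```

With a pandas DataFrame, `DataFrame.corr()` produces the same kind of matrix for numeric columns. Ten rows are far too few to reproduce the coefficients reported for all 838 observations.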

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
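
The two displays described above reduce to simple bookkeeping: a count of missing entries per column, and a cell-by-cell present/missing grid. The sketch below shows both with the standard library only. The helper `nullity_by_column` and the three records are hypothetical and not taken from this dataset, which reports 0 missing cells.

```python
def nullity_by_column(rows):
    """Count missing (None) entries per column in a list of dict records."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + (value is None)
    return counts


# Hypothetical records for illustration only; None marks a missing cell.
rows = [
    {"Age": 18, "Smokes": 0.0, "IUD": None},
    {"Age": None, "Smokes": 1.0, "IUD": 0.0},
    {"Age": 52, "Smokes": None, "IUD": None},
]

counts = nullity_by_column(rows)
print(counts)

# Text stand-in for a nullity matrix: one line per record,
# '#' for a present cell and '.' for a missing one.
for row in rows:
    print("".join("." if row[name] is None else "#" for name in counts))
```

On a pandas DataFrame the per-column counts are available as `df.isna().sum()`.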

Sample

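Row | Unnamed: 0 | Age | No_of_sex_partner | First_sexual_intercourse | No_pregnancies | Smokes | Smokes_yrs | Smokes_packs_yr | Hormonal_Contraceptives | Hormonal_Contraceptives_years | IUD | IUD_years | STDs | STDs_number | STDs_condylomatosis | STDs_cervical_condylomatosis | STDs_vaginal_condylomatosis | STDs_vulvo_perineal_condylomatosis | STDs_syphilis | STDs_pelvic_inflammatory_disease | STDs_genital_herpes | STDs_molluscum_contagiosum | STDs_AIDS | STDs_HIV | STDs_Hepatitis_B | STDs_HPV | STDs_No_of_diagnosis | Dx_Cancer | Dx_CIN | Dx_HPV | Dx | Hinselmann | Schiller | Citology | Biopsy
0 | 0 | 18 | 4.0 | 15.0 | 1.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 1 | 15 | 1.0 | 14.0 | 1.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2 | 3 | 52 | 5.0 | 16.0 | 4.0 | 1.0 | 37.000000 | 37.0 | 1.0 | 3.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0
3 | 4 | 46 | 3.0 | 21.0 | 4.0 | 0.0 | 0.000000 | 0.0 | 1.0 | 15.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
4 | 5 | 42 | 3.0 | 23.0 | 2.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
5 | 6 | 51 | 3.0 | 17.0 | 6.0 | 1.0 | 34.000000 | 3.4 | 0.0 | 0.0 | 1.0 | 7.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1
6 | 7 | 26 | 1.0 | 26.0 | 3.0 | 0.0 | 0.000000 | 0.0 | 1.0 | 2.0 | 1.0 | 7.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
7 | 8 | 45 | 1.0 | 20.0 | 5.0 | 0.0 | 0.000000 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0
8 | 9 | 44 | 3.0 | 15.0 | 8.0 | 1.0 | 1.266973 | 2.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
9 | 10 | 44 | 3.0 | 26.0 | 4.0 | 0.0 | 0.000000 | 0.0 | 1.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0

Row | Unnamed: 0 | Age | No_of_sex_partner | First_sexual_intercourse | No_pregnancies | Smokes | Smokes_yrs | Smokes_packs_yr | Hormonal_Contraceptives | Hormonal_Contraceptives_years | IUD | IUD_years | STDs | STDs_number | STDs_condylomatosis | STDs_cervical_condylomatosis | STDs_vaginal_condylomatosis | STDs_vulvo_perineal_condylomatosis | STDs_syphilis | STDs_pelvic_inflammatory_disease | STDs_genital_herpes | STDs_molluscum_contagiosum | STDs_AIDS | STDs_HIV | STDs_Hepatitis_B | STDs_HPV | STDs_No_of_diagnosis | Dx_Cancer | Dx_CIN | Dx_HPV | Dx | Hinselmann | Schiller | Citology | Biopsy
828 | 848 | 31 | 3.0 | 18.0 | 1.0 | 0.0 | 0.0 | 0.00 | 1.0 | 0.50 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
829 | 849 | 32 | 3.0 | 18.0 | 1.0 | 1.0 | 11.0 | 0.16 | 1.0 | 6.00 | 0.0 | 0.0 | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0
830 | 850 | 19 | 1.0 | 14.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
831 | 851 | 23 | 2.0 | 15.0 | 2.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
832 | 852 | 43 | 3.0 | 17.0 | 3.0 | 0.0 | 0.0 | 0.00 | 1.0 | 5.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
833 | 853 | 34 | 3.0 | 18.0 | 0.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
834 | 854 | 32 | 2.0 | 19.0 | 1.0 | 0.0 | 0.0 | 0.00 | 1.0 | 8.00 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
835 | 855 | 25 | 2.0 | 17.0 | 0.0 | 0.0 | 0.0 | 0.00 | 1.0 | 0.08 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
836 | 856 | 33 | 2.0 | 24.0 | 2.0 | 0.0 | 0.0 | 0.00 | 1.0 | 0.08 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
837 | 857 | 29 | 2.0 | 20.0 | 1.0 | 0.0 | 0.0 | 0.00 | 1.0 | 0.50 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0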